Summer 2015 Education/Research Project at Cornell University

2016-05-11

Time

July 5-31, 2015

Number of Participants

33

Course

Programming Languages; Professional Practice II

Brief Intro

The long-awaited 2015 summer semester began for 33 computer science juniors with a 17-hour flight to Ithaca, NY, the home of Cornell University. They were excited to finally arrive at the Ivy League school where they would spend an intensive month taking classes and doing research. This was the second year of the summer research program co-hosted by Zhiyuan College and the Cornell Department of Computer Science (CS). Cornell offered two specialized credit courses: Programming Languages and Professional Practice II.

Besides the two courses, the CS department invited 13 professors to give seminars during the summer program. Through these lectures, students were exposed to cutting-edge research and could talk face-to-face afterwards with the scholars giving the talks. Chinese and American students openly discussed science, life, and culture. After each seminar, everyone had lunch together, which deepened their friendships. The four weeks of study were intense. To balance study and rest, and to let students fully experience life at Cornell, the summer program held various extracurricular activities: a tour of Niagara Falls, a visit to the underground collider lab, an ice cream social, pizza parties at the Hopcrofts’ and the Gries’ homes, a bowling night, a community music concert, ziplining, and more. These events became beautiful and refreshing memories of a busy summer.

Academic Report Abstract

1:00pm, July 7: Lecture by Fred B. Schneider

  • Introduction
    1)Good theory takes practice;
    2)Theoretical systems;
    3)Active vibration control systems in airplanes used a lot of gas in the 1980s;
    4)Reliable computers in airplanes need to be designed;
    5)SRI International (formerly the Stanford Research Institute).

  • Definition
    1)Byzantine fault model: a model where anything goes;
    2)t-Byzantine fault tolerant machine: a machine which can tolerate t Byzantine faults.

  • Integrity constraints
    1)IC1: All non-faulty processes agree on the transmitter’s value;
    2)IC2: If the transmitter is non-faulty, then all non-faulty processes agree on the value it sent.

  • Theorem 1: There is no solution to the Byzantine generals problem for n = 3, t = 1.
  • Theorem 2: For n ≤ 3t, there is no solution to the t-tolerant Byzantine generals problem with n processes.
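
The abstract above only records the definitions and impossibility results. As a complement, the following is a minimal Python sketch (not taken from the lecture) of the Lamport–Shostak–Pease OM(1) oral-messages protocol for n = 4, t = 1, the smallest case in which the problem is solvable; the function names and the ATTACK/RETREAT values are illustrative only.

    from collections import Counter

    def majority(votes):
        # Majority vote with a fixed default on ties, as in Lamport et al.
        value, count = Counter(votes).most_common(1)[0]
        return value if 2 * count > len(votes) else "RETREAT"

    def om1(order, commander_faulty=False, faulty_lieutenant=None):
        """OM(1) with one commander and three lieutenants (n = 4, t = 1)."""
        # Round 1: the commander sends its order to every lieutenant.
        # A faulty commander may send different orders (hardcoded here for illustration).
        received = ["ATTACK", "RETREAT", "ATTACK"] if commander_faulty else [order] * 3

        # Round 2: each lieutenant relays what it received to the other two;
        # a faulty lieutenant may relay the opposite order.
        decisions = []
        for i in range(3):
            votes = [received[i]]
            for j in range(3):
                if j == i:
                    continue
                relayed = received[j]
                if j == faulty_lieutenant:
                    relayed = "RETREAT" if relayed == "ATTACK" else "ATTACK"
                votes.append(relayed)
            decisions.append(majority(votes))
        return decisions

    # One faulty lieutenant: loyal lieutenants agree on the commander's order (IC1, IC2).
    print(om1("ATTACK", faulty_lieutenant=2))      # ['ATTACK', 'ATTACK', 'ATTACK']
    # Faulty commander: the loyal lieutenants still agree on a common order (IC1).
    print(om1("ATTACK", commander_faulty=True))    # ['ATTACK', 'ATTACK', 'ATTACK']

For n = 3 and t = 1, a loyal lieutenant sees only the commander's order and one relayed report and cannot tell which of the two is lying, which is the content of Theorem 1.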

1:45pm, July 14: Lecture by Ken Birman

  • High Assurance
    1)Heavily tested;
    2)Fault tolerant;
    3)Perhaps secure;
    4)Consistency;
    5)Automatically adaptive;
    6)Fast & scalable.

  • Research Goals
    1) Better tools for high assurance cloud computing;
    2)Work with real users to demonstrate impact on nationally important, mission-critical scenarios;
    3)Push performance to the outer limits.

  • EXAMPLE: A “SMART” POWER GRID
    1)Monitoring and dynamic control are vital to optimizing power delivery;
    2)Renewables (solar, wind) are especially hard to manage, since they fluctuate a lot;
    3)Recovery after major storm damage requires smart planning based on the state of the grid.

  • Machine Intelligence / A.I.
    1) In fact these are exactly the same kinds of problems that researchers in ML/A.I. explore;
    2) Grid clouds enable new forms of distributed collaboration tools.

  • The features of live distributed systems vary, and there are many challenges.

  • ISIS2 IS AT THE CORE
    1)Group communication system;
    2)We use it in many ways and benefit from its strong security;
    3)But speed has been a concern.

  • ISIS2 System
    1)Core functionality: fault tolerance and speed;
    2)Intended for use in very demanding large-scale settings;
    3)The local object instance functions as a gateway;
    4)Read-only operations are performed on local state;
    5)Update consistency;
    6)Make the developer’s life easier;
    7)Virtual synchrony consistency model (a toy sketch of this local-read / group-update pattern follows this lecture summary).

  • Other ways to use ISIS2
    1)CloudMake. HDRTFS. RDMA.
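
Isis2 itself is a C# library whose API is not reproduced in the notes above. Purely to illustrate the pattern in the lecture summary (the local object instance acts as a gateway, reads are served from local state, updates go through the group in a consistent order), here is a hypothetical Python sketch; the class Group and the methods multicast/query are invented names for illustration, not the Isis2 API.

    import threading

    class Group:
        """Toy replicated group: every update is applied to all members in the
        same order; reads are served from each member's local replica."""

        def __init__(self, n_members):
            self.lock = threading.Lock()            # stands in for total-order multicast
            self.replicas = [dict() for _ in range(n_members)]

        def multicast(self, key, value):
            # Update path: deliver the update to every replica in one agreed order.
            with self.lock:
                for replica in self.replicas:
                    replica[key] = value

        def query(self, member, key):
            # Read path: a read-only operation touches only one member's local
            # state, so it is cheap and needs no coordination.
            return self.replicas[member].get(key)

    g = Group(n_members=3)
    g.multicast("grid/line-42/load", 0.83)
    print(g.query(0, "grid/line-42/load"))   # 0.83
    print(g.query(2, "grid/line-42/load"))   # 0.83 -- consistent across members

The point of the split is the one made in the lecture summary: reads stay local and fast, while the rarer updates pay the coordination cost.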

1:00pm, July 15: Lecture by Robbert van Renesse

  • Transformations
    The last decade or two have seen a wild growth of large-scale systems that have strong responsiveness requirements. Such systems include cloud services as well as sensor networks. Their scalability and reliability requirements mandate that these systems are both sharded and replicated. Also, these systems evolve quickly as a result of changes in workload, adding functionality, deploying new hardware, and so on. While these systems are useful, they can behave in erratic ways and it is not clear that one can build mission- and life-critical systems this way.

  • Supercloud: Combining Cloud Resources into a Single Cloud
    Infrastructure as a Service (IaaS) clouds couple applications tightly with the underlying infrastructures and services. This vendor lock-in problem forces users to apply ad-hoc deployment strategies in order to tolerate cloud failures, and limits the ability to do virtual machine (VM) migration and resource scaling across different clouds, and even within availability zones of the same cloud.

  • GridControl: Monitoring and Control of the Smart Power Grid
    There are pressing economic and environmental arguments for the overhaul of the current power grid, and its replacement with a Smart Grid that integrates new kinds of green power generating systems, monitors power use, and adapts consumption to match power costs and system load. Many promising power management ideas demand scalability of a kind that seems a good fit for cloud computing, but also have requirements (real-time, consistency, privacy, security, etc.) that cloud computing does not currently support.

  • Configuring Distributed Computations
    Configuring large distributed computations is a challenging task. Efficient use of a cluster requires configuration tuning and optimization based on careful examination of application and hardware properties. Considering the large number of parameters and the impracticality of using trial and error in a production environment, cluster operators tend to make these decisions based on their experience and rules of thumb. Such configurations can lead to underutilized and costly clusters.

  • Reliable Broadcast for Geographically Dispersed Datacenters
    Sprinkler is a reliable high-throughput broadcast facility for geographically dispersed datacenters. For scaling cloud services, datacenters use caching throughout their infrastructure. Sprinkler can be used to broadcast update events that invalidate cache entries. The number of recipients can scale to many thousands in such scenarios. The Sprinkler infrastructure consists of two layers: one layer to disseminate events among datacenters, and a second layer to disseminate events among machines within a datacenter. A novel garbage collection interface is introduced to save storage space and network bandwidth. (A toy sketch of this two-layer fan-out appears after this lecture's abstracts.)

  • Heterogeneous Failure Models
    The robustness of distributed systems is usually phrased in terms of the number of failures of certain types that they can withstand. However, these failure models are too crude to describe the different kinds of trust and expectations of participants in the modern world of complex, integrated systems extending across different owners, networks, and administrative domains. Modern systems often exist in an environment of heterogeneous trust, in which different participants may have different opinions about the trustworthiness of other nodes, and a single participant may consider other nodes to differ in their trustworthiness.

  • Correct-by-construction Fault-tolerant Distributed Systems
    Fault-tolerant distributed systems, algorithms, and protocols are notoriously hard to build. Going from a specification to an implementation involves many subtle steps that are easy to get wrong. Moreover, fault-tolerant algorithms require many sources of diversity in order to make them survive a variety of failures. Programmers, even different ones, tend to make the same mistakes when they are implementing code from a specification.
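
To make the two-layer structure described in the Sprinkler abstract concrete, here is a hypothetical Python sketch (not Sprinkler's actual code or API): a first layer hands an event to one relay per datacenter, and each relay then fans the event out to the machines inside its own datacenter. All names are illustrative.

    class Datacenter:
        """One datacenter with a relay that fans events out to its machines."""

        def __init__(self, name, machines):
            self.name = name
            self.machines = machines           # e.g. cache servers in this datacenter
            self.delivered = []                # (machine, event) pairs received locally

        def deliver_locally(self, event):
            # Layer 2: intra-datacenter dissemination from the local relay.
            for m in self.machines:
                self.delivered.append((m, event))

    def broadcast(event, datacenters):
        # Layer 1: inter-datacenter dissemination -- one relay per datacenter
        # receives the event once, then handles local fan-out itself.
        for dc in datacenters:
            dc.deliver_locally(event)

    dcs = [Datacenter("us-east", ["m1", "m2"]), Datacenter("eu-west", ["m3", "m4"])]
    broadcast({"invalidate": "user:42"}, dcs)
    print(sum(len(dc.delivered) for dc in dcs))    # 4 machines got the invalidation event

The real system additionally makes dissemination reliable and adds the garbage-collection interface mentioned in the abstract; the sketch shows only the fan-out structure.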

1:00pm, July 21: Lecture by Joseph Halpern

  • Nash equilibrium is commonly used in game theory. However, using Nash equilibrium can give unreasonable answers (e.g., in the prisoner’s dilemma; see the worked example below). In this talk, Prof. Halpern presented three refinements that address these problems of Nash equilibrium.
    1)k-resilient and t-immune equilibria enhance robustness against faulty and unexpected behavior of agents;
    2)Including the computation cost in the utility functions avoids unreasonable answers in Bayesian games and the repeated prisoner’s dilemma;
    3)Using augmented games handles situations where the structure of the game may not be common knowledge.
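
As a reminder of why plain Nash equilibrium gives an "unreasonable answer" in the prisoner's dilemma, the following small Python check (textbook payoff numbers, not taken from the talk) confirms that mutual defection is the only pair of mutual best responses, even though mutual cooperation pays both players more.

    from itertools import product

    # Standard one-shot prisoner's dilemma payoffs: (row player, column player).
    PAYOFF = {
        ("C", "C"): (3, 3),
        ("C", "D"): (0, 5),
        ("D", "C"): (5, 0),
        ("D", "D"): (1, 1),
    }
    ACTIONS = ["C", "D"]

    def is_nash(a, b):
        # (a, b) is a Nash equilibrium if neither player gains by deviating alone.
        row_ok = all(PAYOFF[(a, b)][0] >= PAYOFF[(a2, b)][0] for a2 in ACTIONS)
        col_ok = all(PAYOFF[(a, b)][1] >= PAYOFF[(a, b2)][1] for b2 in ACTIONS)
        return row_ok and col_ok

    print([p for p in product(ACTIONS, ACTIONS) if is_nash(*p)])   # [('D', 'D')]

The unique equilibrium (D, D) yields payoffs (1, 1) although (C, C) would yield (3, 3); the refinements listed above are ways to rule out or soften such outcomes.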

2:45pm, July 21: Lecture by Dexter Kozen

  • Abstraction
    1)Introduce an informal view of coinduction and illustrate its use in mathematical arguments;
    2)Argue that coinduction is not only about bisimilarity and equality of behaviors, but is also applicable to a variety of functions and relations defined on coinductive datatypes (a small stream sketch follows the outline below);
    3)Aim to promote the principle as a useful tool for the working mathematician and to bring it to a level of familiarity on par with induction.

  • Talk Outline
    1)Induction;
    2)Induction vs Coinduction;
    3)Lexicographic Order on Streams;
    4)Transitivity of <lex;
    5)Where is the Basis?
    6)Monotonicity of + on Streams;
    7)Coalgebras and Coalgebra Homomorphisms;
    8)Final Coalgebras;
    9)Recursive Types;
    10)Bisimulations and Bisimilarity;
    11)Other Signatures;
    12)A Coinductive Proof Principle;
    13)Conclusion.
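
As a small illustration (not from the talk) of a coinductive datatype and corecursive definitions, the sketch below models infinite streams as Python generators, with pointwise addition on streams (item 6 of the outline) and a fuel-bounded check of the lexicographic order (items 3-5): if two streams never differ, no finite scan can decide the comparison, which is the "Where is the basis?" question. Function names are illustrative.

    from itertools import islice

    def ones():
        # The stream ones = 1 :: ones, defined corecursively: always productive,
        # but with no base case.
        while True:
            yield 1

    def nats():
        # The stream of naturals 0, 1, 2, ... (in coinductive style: nats = 0 :: map (+1) nats).
        n = 0
        while True:
            yield n
            n += 1

    def add(xs, ys):
        # Pointwise addition: (x :: xs) + (y :: ys) = (x + y) :: (xs + ys).
        for x, y in zip(xs, ys):
            yield x + y

    def lex_less(xs, ys, fuel=1000):
        # Lexicographic order on streams: scan for the first position where they
        # differ; if they are bisimilar, no finite scan settles the question.
        for x, y in islice(zip(xs, ys), fuel):
            if x != y:
                return x < y
        return None   # undecided within `fuel` elements

    print(list(islice(add(nats(), ones()), 5)))   # [1, 2, 3, 4, 5]
    print(lex_less(nats(), add(nats(), ones())))  # True: the streams differ at position 0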

The abstracts above were provided by Qizhe Xie, Xutong Chen, Yiding Feng, Zhuoyue Feng, Chao Liao, and others.